Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get writeable buffer #120

Closed
wants to merge 2 commits into from
Closed

Conversation

victorstewart
Copy link
Contributor

i'm writing a custom extension that might invoke a 3rd party serialization method that takes a buffer to serialize into, and need to avoid an unnecessary copy (aka by first serializing into a temporary buffer then copying into the bitsery generate buffer) as these object might be many, many bytes... so i added 2 methods to extract the buffer pointer to write into.

const size_t newOffset = _currOffset + size;
maybeResize(newOffset, TResizable{});
_currOffset = newOffset;
return _beginIt + static_cast<diff_t>(_currOffset);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either I miss something, or this is a bug.
Instead of

    _currOffset = newOffset;
    return _beginIt + static_cast<diff_t>(_currOffset);

You should probably do like this:

    auto res = _beginIt + static_cast<diff_t>(_currOffset);
    _currOffset = newOffset;
    return res;

@fraillt
Copy link
Owner

fraillt commented Feb 9, 2025

In general I like the idea, but it's is not fully executed, and a lot of things are still missing, in order to land it as part as core functionality.
There are some of them:

  • I think function name should be different, there already exists writeBuffer with different signature, but does very different thing. I would name this function differently as well, maybe something like directWritePointer, allocateForDirectWrite or something along these lines...
  • all other existing adapters should also have this function with body static_assert(...," **this*** function is not supported by **this** adapter")
  • lastly, alignment... it's too important to ignore, and it should be part of API. What is worse, is that existing adapters doesn't enforce this in anyway. You can initialize input or output buffer with _begin pointer not being aligned at any size... which means that you cannot ask for "direct access pointer" that is aligned.

In the end, I think it would make more sense to write new input/output adapter together with extension. There's already example of one in #124.

@fraillt fraillt closed this Feb 9, 2025
@Diarica
Copy link

Diarica commented Feb 11, 2025

In general I like the idea, but it's is not fully executed, and a lot of things are still missing, in order to land it as part as core functionality. There are some of them:

  • I think function name should be different, there already exists writeBuffer with different signature, but does very different thing. I would name this function differently as well, maybe something like directWritePointer, allocateForDirectWrite or something along these lines...
  • all other existing adapters should also have this function with body static_assert(...," **this*** function is not supported by **this** adapter")
  • lastly, alignment... it's too important to ignore, and it should be part of API. What is worse, is that existing adapters doesn't enforce this in anyway. You can initialize input or output buffer with _begin pointer not being aligned at any size... which means that you cannot ask for "direct access pointer" that is aligned.

In the end, I think it would make more sense to write new input/output adapter together with extension. There's already example of one in #124.

For alignment, I think it's definitely should be guranteed by user, user must know what they are doing before they use uncopy deserialize.
Gurantee the alignment requires : 1, user passed buffer should be aligned as well 2, The serialized buffer should be added paddings(one straightforward way is simply copy the whole object memory into serialize buffer, it's will waste some memory, but do guarantee the alignment)
This is already can be done use the extension in #124(require a bit modification, no need to serialize the buffer size, the size can be know at compile time), can directly serialize/des everything with original buffer.

And in-action, I don't think all the buffer data all requires the alignment in memory, for instances : flac, mp3 audio data, they can just be bitstream and pass into a decoder way to play it, only for wav, directly PCM data is required for 4b alignment.
and textures, the alignment requirement only gurantee by GPU memory, for memory-side it's still can be 1 byte alignment.
On the other hand, for the forced aligned data, just use container4b to copy it.

@Diarica
Copy link

Diarica commented Feb 11, 2025

In general I like the idea, but it's is not fully executed, and a lot of things are still missing, in order to land it as part as core functionality. There are some of them:

  • I think function name should be different, there already exists writeBuffer with different signature, but does very different thing. I would name this function differently as well, maybe something like directWritePointer, allocateForDirectWrite or something along these lines...
  • all other existing adapters should also have this function with body static_assert(...," **this*** function is not supported by **this** adapter")
  • lastly, alignment... it's too important to ignore, and it should be part of API. What is worse, is that existing adapters doesn't enforce this in anyway. You can initialize input or output buffer with _begin pointer not being aligned at any size... which means that you cannot ask for "direct access pointer" that is aligned.

In the end, I think it would make more sense to write new input/output adapter together with extension. There's already example of one in #124.

And I would like to know, if we serialize a 4byte data to drive, will it gurantee the alignment in file buffer side? Doesn't mean directly copy whole 32 bits to buffer, I mean it can gurantee the start address inside buffer of this 4 byte data can be multiple of 4?
If it is, I think gurantee the alignment and without copy buffer is implementable.

@Diarica
Copy link

Diarica commented Feb 11, 2025

In general I like the idea, but it's is not fully executed, and a lot of things are still missing, in order to land it as part as core functionality. There are some of them:

  • I think function name should be different, there already exists writeBuffer with different signature, but does very different thing. I would name this function differently as well, maybe something like directWritePointer, allocateForDirectWrite or something along these lines...
  • all other existing adapters should also have this function with body static_assert(...," **this*** function is not supported by **this** adapter")
  • lastly, alignment... it's too important to ignore, and it should be part of API. What is worse, is that existing adapters doesn't enforce this in anyway. You can initialize input or output buffer with _begin pointer not being aligned at any size... which means that you cannot ask for "direct access pointer" that is aligned.

In the end, I think it would make more sense to write new input/output adapter together with extension. There's already example of one in #124.

And I would like to know, if we serialize a 4byte data to drive, will it gurantee the alignment in file buffer side? Doesn't mean directly copy whole 32 bits to buffer, I mean it can gurantee the start address inside buffer of this 4 byte data can be multiple of 4? If it is, I think gurantee the alignment and without copy buffer is implementable.

Okay I seen, there is no gurantee for the serialized buffer, only way to keep the alignment is add paddings, but it's still implementable! As the way I said, we need input and output adpater work together, it's simple, user tell us the alignment requirement of non copying buffer, and we add, give me a sec, I'm testing and will post result here soon.

@fraillt
Copy link
Owner

fraillt commented Feb 11, 2025

Thanks for working on it :)
I actually got an idea as well on how to implement this on existing adapters.
Buffer adapters has _beginIt which is a pointer basically, so we can verify alignment of initial buffer pointer.
As long as _beginIt alignment is >= then required by "allocateForDirectWrite", we're good to go! if not, then we the only option we have is panic. Practically we should probably use something like assert(is_sufficiently_aligned(_beginIt)).
Basically full implementation could probably look something like this:

template <size_t ALIGNMENT>
TValue* allocateForDirectWrite(size_t size)
{
  // first make sure that initial buffer has sufficient alignment ( >= than user type needs)
  assert(is_sufficiently_aligned<ALIGNMENT>(_beginIt));
  auto padding = (_currOffset + ALIGNMENT - _currOffset % ALIGNMENT) % ALIGNMENT;
  // todo write 0s as padding bytes
  auto res = _beginIt + static_cast<diff_t>(_currOffset + padding)
  _currOffset = _currOffset + size + padding
  return res
}

On the call side you would probably do something like

auto ptrTmp = writer.allocateForDirectWrite<alignof(T)>(size);
auto ptr = std::assume_aligned<alignof(T)>(ptrTmp); // to enable more efficient code generation

@Diarica
Copy link

Diarica commented Feb 11, 2025

Thanks for working on it :) I actually got an idea as well on how to implement this on existing adapters. Buffer adapters has _beginIt which is a pointer basically, so we can verify alignment of initial buffer pointer. As long as _beginIt alignment is >= then required by "allocateForDirectWrite", we're good to go! if not, then we the only option we have is panic. Practically we should probably use something like assert(is_sufficiently_aligned(_beginIt)). Basically full implementation could probably look something like this:

template <size_t ALIGNMENT>
TValue* allocateForDirectWrite(size_t size)
{
  // first make sure that initial buffer has sufficient alignment ( >= than user type needs)
  assert(is_sufficiently_aligned<ALIGNMENT>(_beginIt));
  auto padding = (_currOffset + ALIGNMENT - _currOffset % ALIGNMENT) % ALIGNMENT;
  // todo write 0s as padding bytes
  auto res = _beginIt + static_cast<diff_t>(_currOffset + padding)
  _currOffset = _currOffset + size + padding
  return res
}

On the call side you would probably do something like

auto ptrTmp = writer.allocateForDirectWrite<alignof(T)>(size);
auto ptr = std::assume_aligned<alignof(T)>(ptrTmp); // to enable more efficient code generation

Okay, I'm late, your solution is more straightforward, but can be optimize, we only need one modular operation and output adapter could only provide single noncopying write function.
I think for directly writing is enough, are you considering integrate this function into actual library extension?
and last thing about safety is to check user deserialization buffer is aligned like this extension said.
Last things last, Thanks for your contribution! bitsery is an amazing binary serialization library, I've already built whole asset pipeline based on bitsery, at past I just hand write and need works on such offsets, it's fast but annoying and no compression, but bitsery changed everything!

This is the extension code, with caution : There is only be single alignment value available for all the noncopyable buffer(otherwise alignment value is multiple of others buffer like 32, 16, 8, 4), this might need some safety improvements(currently it's allow each buffer to say a wanted alignment)
The deserialization buffer, MUST BE ALIGNED AS SAME AS noncopyable buffer, otherwise nothing will actually aligned.

namespace bitsery {
namespace ext {

template <typename TSize>
class BufferDataNoCopying
{
public:

  explicit BufferDataNoCopying(TSize& size, uint16_t alignment):_size{size},_alignment(alignment) {}

    template<typename Ser, typename T, typename Fnc>
  void serialize(Ser &ser, const T &obj, Fnc &&fnc) const {
    auto& writer = ser.adapter();
    writer.template writeBytes<4>(static_cast<uint32_t>(_size));

    
    auto dummyNeed = writer.currentWritePos() % _alignment;
    
    if (dummyNeed != 0)
    {
      dummyNeed = _alignment - dummyNeed;
      // Because we need to copy these meaningless data, so just point to an random address and copy these data.
      // Avoid any extra allocation.
      writer.template writeBuffer<1>(((const uint8_t*)&writer), dummyNeed);
    }
    
    writer.template writeBuffer<1>(static_cast<const uint8_t*>(obj), _size);
  }

  template<typename Des, typename T, typename Fnc>
  void deserialize(Des &des, T &obj, Fnc &&fnc) const {
    auto& reader = des.adapter();
    
    reader.template readBytes<4>(_size);

    
    auto dummyNeed = reader.currentReadPos() % _alignment;
    
    if (dummyNeed != 0)
    {
      dummyNeed = _alignment - dummyNeed;
      //Skip the meaningless.
      reader.currentReadPos(reader.currentReadPos() + dummyNeed);

    }
    reader.readBufferNoCopying((obj), _size);

    
  }

private:
  TSize& _size;
  uint16_t _alignment;
};
}


namespace traits {
template<typename TSize, typename T>
struct ExtensionTraits<ext::BufferDataNoCopying<TSize>, T>
{
  using TValue = void;
  static constexpr bool SupportValueOverload = false;
  static constexpr bool SupportObjectOverload = true;
  static constexpr bool SupportLambdaOverload = false;
};
}

}

@Diarica
Copy link

Diarica commented Feb 12, 2025

Thanks for working on it :) I actually got an idea as well on how to implement this on existing adapters. Buffer adapters has _beginIt which is a pointer basically, so we can verify alignment of initial buffer pointer. As long as _beginIt alignment is >= then required by "allocateForDirectWrite", we're good to go! if not, then we the only option we have is panic. Practically we should probably use something like assert(is_sufficiently_aligned(_beginIt)). Basically full implementation could probably look something like this:

template <size_t ALIGNMENT>
TValue* allocateForDirectWrite(size_t size)
{
  // first make sure that initial buffer has sufficient alignment ( >= than user type needs)
  assert(is_sufficiently_aligned<ALIGNMENT>(_beginIt));
  auto padding = (_currOffset + ALIGNMENT - _currOffset % ALIGNMENT) % ALIGNMENT;
  // todo write 0s as padding bytes
  auto res = _beginIt + static_cast<diff_t>(_currOffset + padding)
  _currOffset = _currOffset + size + padding
  return res
}

On the call side you would probably do something like

auto ptrTmp = writer.allocateForDirectWrite<alignof(T)>(size);
auto ptr = std::assume_aligned<alignof(T)>(ptrTmp); // to enable more efficient code generation

Well, actually with consider of branch predition, double modular is probably fast than per modular + if... But I think the predit sucessful chance is great, because the chance of no need padding is low I think(depends on previous serialization fields)
Well this is a really small optimization

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants